Frequency Domain Analysis of MFCC Feature Extraction in Children’s Speech Recognition System
Authors
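Abstract
Research on speech recognition systems currently focuses on the analysis of robust speech recognition systems. When speech signals are combined with noise, the recognition system becomes distracted and struggles to identify the speech sounds. Therefore, the development of a robust speech recognition system continues to be carried out. The principle of such a system is to eliminate noise from the speech signals and restore the original information signals. In this paper, the researchers conducted a frequency domain analysis of one stage of the Mel Frequency Cepstral Coefficients (MFCC) process, the Fast Fourier Transform (FFT), in a children's speech recognition system. The FFT analysis in the feature extraction process determined the effect of the frequency value characteristics utilized in the FFT output on noise disruption. The analysis method was designed as three scenarios based on the employed FFT points. The differences between the scenarios were based on the number of parts into which the FFT points were divided: all FFT points were divided into four, three, and two parts in the first, second, and third scenarios, respectively. This study utilized children's speech data from the isolated TIDIGIT English digit corpus. As comparative data, noise was added manually to simulate real-world conditions. The results showed that using a particular frequency portion, following the scenarios designed for MFCC, affected the recognition system performance, and the effect was relatively significant on noisy speech data. The Scenario 3 (C1) version generated the highest accuracy, exceeding the accuracy of the conventional MFCC method. The average accuracy increased by more than 1% for all tested noise types. Testing with various noise intensity values (SNR) indicates that Scenario 3 (C1) generates higher accuracy than the conventional MFCC at all tested SNR values. This shows that the selection of the specific frequencies utilized in MFCC feature extraction significantly affects recognition accuracy on noisy speech.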
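The scenario design described in the abstract (dividing the FFT points into four, three, or two parts and using one portion) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which portion "C1" denotes or how unselected FFT points are treated, so zeroing them is an assumption here. The helper names `split_fft_bins` and `keep_spectrum_part`, the 8 kHz sample rate, the 256-point FFT, and the synthetic 440 Hz test frame are all hypothetical choices made for illustration.

```python
import numpy as np

def split_fft_bins(n_bins, n_parts):
    """Split n_bins FFT bin indices into n_parts contiguous index arrays."""
    return np.array_split(np.arange(n_bins), n_parts)

def keep_spectrum_part(frame, n_fft, n_parts, part_index):
    """Power spectrum of one frame with only one contiguous part of the bins kept.

    Assumption: bins outside the chosen part are set to zero. The abstract
    does not state how the unselected FFT points are handled.
    """
    windowed = frame * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed, n=n_fft)) ** 2
    parts = split_fft_bins(power.shape[0], n_parts)
    masked = np.zeros_like(power)
    masked[parts[part_index]] = power[parts[part_index]]
    return masked

# Hypothetical input: a 25 ms frame at 8 kHz containing a 440 Hz tone.
sample_rate = 8000
t = np.arange(200) / sample_rate
frame = np.sin(2 * np.pi * 440 * t)

# Scenarios 1, 2 and 3 divide the FFT points into 4, 3 and 2 parts.
scenario_parts = {1: 4, 2: 3, 3: 2}

# Scenario 3: keep either the lower or the upper half of the spectrum.
low_half = keep_spectrum_part(frame, n_fft=256, n_parts=scenario_parts[3], part_index=0)
high_half = keep_spectrum_part(frame, n_fft=256, n_parts=scenario_parts[3], part_index=1)
```

In a full MFCC pipeline, the masked power spectrum would then pass through the mel filterbank, logarithm, and discrete cosine transform as usual; those stages are omitted here to keep the sketch focused on the portion-selection step.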
Similar resources
Face Images Feature Extraction Analysis for Recognition in Frequency Domain
In this paper, a novel technique to extract facial features for recognition in the frequency domain using the Discrete Fourier Transform (DFT) is presented. In the preprocessing phase, facial tilt and varying image background challenges have been addressed to improve the success rate. Varying facial expressions within a class have been minimised by using a decimation algorithm. Experiments on ORL and YALE datas...
Feature Extraction Using MFCC
Mel Frequency Cepstral Coefficient is a very common and efficient technique for signal processing. This paper presents a new purpose for MFCC by using it for hand gesture recognition. The objective of using MFCC for hand gesture recognition is to explore the utility of MFCC for image processing. Until now it has been used in speech recognition and for speaker identification. The pres...
Improving the performance of MFCC for Persian robust speech recognition
The Mel Frequency cepstral coefficients are the most widely used feature in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...
Intoxicated Speech Detection using MFCC Feature Extraction and Vector Quantization
This study has been done on a technique which is suitable for tapping telephone conversations from a remote location to identify intoxication and consequent impaired brain activity that may cause criminal events, e.g. DUI (driving under influence). This technique is time efficient, easy to use, non-invasive for people and affordable for law enforcement personnel, bartenders/servers, cou...
Feature extraction from time-frequency matrices for robust speech recognition
In this paper we present a study of the time-frequency distribution of acoustic-phonetic information for the Spanish language. It is based on a large, automatically labeled Spanish database, and we conclude that the results are similar to those obtained for hand-labeled English databases. We use bidimensional LDA [1] to extract discriminant features in the time-frequency domain (TF) that are more robus...
Journal
Journal title: Jurnal Infotel
Year: 2022
ISSN: 2460-0997, 2085-3688
DOI: https://doi.org/10.20895/infotel.v14i1.740